Overlay modelling is a technique for describing a student's problem
solving skills in terms of a modular program designed to be an expert for the
given domain. The model is an overlay on the expert program in that it
consists of a set of hypotheses regarding the student's familiarity with the
skills employed by the expert. The modelling is performed by a set of P rules
that are triggered by different sources of evidence, and whose effect is to
modify these hypotheses. A P critic monitors these rules to detect
discontinuities and inconsistencies in their predictions.


A first implementation of overlay modelling exists as a component of
WUSOR-II, a CAI program based on artificial intelligence techniques. WUSOR-II
coaches a student in the logical and probability skills required to play the
computer game WUMPUS. Preliminary evidence indicates that overlay modelling
significantly improves the appropriateness of the tutoring program's
explanations.


This report describes research done at the Artificial Intelligence Laboratory
of the Massachusetts Institute of Technology. It was supported in part by the Advanced
Research Projects Agency of the Department of Defense under Office of Naval Research
contract N00014-75-C-0643, and in part by the Division for Study and Research in
Education, Massachusetts Institute of Technology.


1 This  paper  has  been  submitted  to  the  Fifth  International  Joint  Conference  on  Artificial 
Intelligence. 
1. The MIT COACH Project

A traditional argument for computer aided instruction (CAI) has been that it is an
economic means for providing individualized instruction. The rapidly falling costs of
hardware make the economics of CAI progressively more appealing. But the extent to
which existing CAI has provided personalized instruction has been limited. This paper
develops a procedural theory of modelling that can be incorporated into CAI programs to
address this limitation.

This theory has been developed as part of the COACH Project at MIT, whose concern
is the development of AI-based CAI programs for tutoring the skills required for
successfully playing various computer games. The computer serves as an assistant to a
learner who is in the process of acquiring the skills necessary to play the game well.
Fig. 1 shows a generalized block diagram for these programs, with the modules given
anthropomorphic names to indicate their function. To distinguish them from their human
counterparts, references to the modules will be capitalized.

Good coaching is critically dependent on a detailed model of the learner in that
the model guides the coach in generating concise and appropriate explanations. This
paper discusses the theory of overlay modelling embodied in the Psychologist module,
the component of the Coach responsible for maintaining such models of the player's
current skills (the K model) and learning preferences (the L model). These models are
used by the Tutor module to prune complex explanations generated by the Expert. Just
as with a human speaker, the Coach abbreviates its statements by eliminating those
facts that are already known by the listener and those facts which are too complex.

A broad treatment of the potential role of computer coaches in education and the
issues raised by their design is given in [Goldstein 1977]. Detailed discussions of
preliminary implementations and experimental results are provided in [Stansfield, Carr
and Goldstein 1976] and [Carr 1977]. Seminal work on AI-based CAI is also described in
[Brown et al. 1975; Brown 1976; Collins & Grignetti 1975]. In particular, overlay
modelling is an extension of the issue-oriented approach to student modelling developed
by Burton and Brown [1975].

Carr & Goldstein, Overlay Modelling

[Fig. 1: Block diagram of a computer Coach. Labels legible in the scan: Psychologist, Tutor, K & L models, overlay weights, move analysis, complexity, player's moves, interaction, and the explicit, background, implicit and structural sources of evidence. Arrows indicate data flow.]

Overlay modelling is a technique for recognizing the constituent skills being
exercised by an individual in performing a problem solving task. The kernel idea is to
design a modular Expert program for the task, and to explain differences between the
behavior of the Expert and the subject in terms of the lack, on the player's part, of
some of the Expert's skills. Thus, a model of the player is a set of hypotheses, each
of which records the system's confidence that the player possesses a given skill. Such
models are called overlays to reflect the fact that the model of the individual is
basically a perturbation on the Expert's structure.

Defining overlays in terms of subsets of the Expert's skills is a simplification
of the modelling problem in that it does not address situations in which the
student has an incorrect skill or an alternative skill. A discussion of this
limitation is given in section 5.

Modelling a learner is difficult. However, preliminary evidence with WUSOR-II
indicates that, at least for the restricted environment of a game and for the limited
purpose of guiding a tutor, adequate modelling can be obtained from: a rule system
that accesses multiple sources of evidence, and a critic that detects inconsistencies
and discontinuities in the player's behavior. Fig. 2 is a block diagram of the
internal structure of the Psychologist.




Section 2 describes the Wumpus game, the experimental domain of the WUSOR-II
coach. The theory of overlay modelling is developed next (sections 3-4), followed by a
discussion of its limitations and extensions (sections 5-7), and concluding with our
experimental program and preliminary results (section 8). Related literature is
surveyed in section 9.

2. Wumpus, an Intellectual Game

The Wumpus game was invented by Gregory Yob [1975] and exercises basic knowledge
of logic, probability, decision analysis and geometry. Players ranging from children
to adults find it enjoyable. The game is a modern day version of Theseus and the
Minotaur. The player is initially placed somewhere in a randomly connected warren of
caves and told the neighbors of his current location. His goal is to locate the horrid
Wumpus and slay it with an arrow. Each move to a neighboring cave yields information
regarding that cave's neighbors. The difficulty in choosing a move arises from the
existence of dangers in the warren: bats, pits and the Wumpus itself. If the player
moves into the Wumpus' lair, he is eaten. If he walks into a pit, he falls to his
death. Bats pick the player up and randomly drop him elsewhere in the warren.

But the player can minimize risk and locate the Wumpus by making the proper
logical and probabilistic inferences from warnings he is given. These warnings are
provided whenever the player is in the vicinity of a danger. The Wumpus can be smelled
within one or two caves. The squeak of bats can be heard one cave away and the breeze
of a pit felt one cave away. The game is won by shooting an arrow into the Wumpus's
lair. If the player exhausts his set of five arrows without hitting the creature, the
game is lost. Fig. 3 illustrates a typical intermediate state a player might reach.

Skilled play exercises knowledge of logic, probability, decision theory and
geometry. The WUSOR-II Expert uses a rule-based representation of this knowledge,
consisting of approximately 20 rules, to infer the risk of visiting new caves.
However, for expository purposes, a simplified rule set consisting of five reasoning
skills is sufficient to illustrate overlay modelling.


L1: (positive evidence rule) A warning in a cave implies that a danger exists
    in a neighbor.

L2: (negative evidence rule) The absence of a warning implies that no danger
    exists in any neighbors.

L3: (elimination rule) If a cave has a warning and all but one of its
    neighbors are known to be safe, then the danger is in the remaining
    neighbor.

P1: (equal likelihood rule) In the absence of other knowledge, all of the
    neighbors of a cave with a warning are equally likely to contain a danger.

P2: (double evidence rule) Multiple warnings increase the likelihood that a
    given cave contains a danger.
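
The logical rules and the double evidence rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the tiny warren and the cave labels are invented for the example, the sketch covers one-cave warnings only (pits or bats, not the two-cave Wumpus smell), and it is not the WUSOR-II Expert's rule set.

```python
from collections import Counter
from typing import Dict, List, Set

Warren = Dict[str, List[str]]  # cave -> neighboring caves (hypothetical layout)

def known_safe(neighbors: Warren, visited: List[str], warned: Set[str]) -> Set[str]:
    """L2: a visited cave with no warning clears all of its neighbors."""
    safe = set(visited)  # the player survived every cave he has visited
    for cave in visited:
        if cave not in warned:
            safe.update(neighbors[cave])
    return safe

def suspects(neighbors: Warren, warned: Set[str], safe: Set[str]) -> Dict[str, List[str]]:
    """L1 and L3: a warning implicates the neighbors not already known to be safe.

    When only one neighbor remains, the elimination rule (L3) has located the danger.
    """
    return {cave: [n for n in neighbors[cave] if n not in safe] for cave in warned}

def evidence_counts(suspect_map: Dict[str, List[str]]) -> Counter:
    """P2: count how many separate warnings implicate each cave."""
    return Counter(n for caves in suspect_map.values() for n in caves)

# Invented example: the player started in S (no warning), then felt drafts in A and B.
neighbors = {"S": ["A", "B", "F"], "A": ["S", "C", "D"], "B": ["S", "D", "E"]}
visited = ["S", "A", "B"]
warned = {"A", "B"}

safe = known_safe(neighbors, visited, warned)
counts = evidence_counts(suspects(neighbors, warned, safe))
print(counts.most_common(1))  # [('D', 2)]
```

Cave D is implicated by both drafts, so under P2 it is the riskiest of the unvisited caves, while C and E each carry single evidence.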

In terms of these skills, an overlay model for a player who has mastered the
simple logical rules (L1, L2), is in the process of acquiring L3, and has not yet
learned P2 is:


RULES   APPROPRIATE   USED   FREQUENCY   KNOWN

L1           6          6       100%      T
L2           4          3        75%      T
L3           4          2        50%      ?
P1           6          6       100%      T
P2           4          1        25%      NIL

                 Overlay Model 1


The frequencies are determined by estimates made by the P rules of the number of times
a skill has been USED in proportion to the number of times it has been APPROPRIATE.
The KNOWN variable is set to T, ? or NIL by the P critic.
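
A minimal sketch of how such an overlay might be stored is given here. The class name `SkillHypothesis` and its field names are assumptions made for this example and are not WUSOR-II's actual representation; the counts for L2, L3 and P2 follow Overlay Model 1.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SkillHypothesis:
    """One entry of an overlay model (hypothetical layout)."""
    appropriate: float = 0.0      # times the Expert judged the skill appropriate
    used: float = 0.0             # times the player appeared to use the skill
    known: Optional[bool] = None  # True = T, False = NIL, None = "?"

    @property
    def frequency(self) -> float:
        """USED in proportion to APPROPRIATE (0 when there is no evidence yet)."""
        if self.appropriate == 0:
            return 0.0
        return self.used / self.appropriate

overlay = {
    "L2": SkillHypothesis(appropriate=4, used=3, known=True),
    "L3": SkillHypothesis(appropriate=4, used=2, known=None),
    "P2": SkillHypothesis(appropriate=4, used=1, known=False),
}

for name, hypothesis in overlay.items():
    print(f"{name}: {hypothesis.frequency:.0%}")
```

Keeping KNOWN separate from the counts mirrors the division of labour in the text: the P rules adjust the counts, and the P critic alone sets KNOWN.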

The WUSOR-II Coach [Carr 77] maintains models of this kind for guiding its
explanations to the student. For example, consider fig. 4: Suppose the player moves
to cave 14, the worst possible move. Given overlay model 1, WUSOR-II would generate
the following tutorial advice:

Ira, it isn't necessary to take such large risks with pits. One of caves 2
and 14 contains a pit. Likewise one of caves 0 and 14 contains a pit. This
is multiple evidence of a pit in cave 14 which makes it probable that cave 14
contains a pit. It is less likely that cave 0 contains a pit. Hence, Ira, we
might want to explore cave 0 instead.

Without the overlay model, the explanation would be longer and more complex as shown
below. The WUSOR-II Tutor has pruned the underlined text from the Expert's complete
analysis by noting that the student is already familiar with the positive and negative
evidence rules.

Ira, it isn't necessary to take such large risks with pits.

Cave 4 must be next to a pit because we felt a draft there. Hence, one of
caves 15, 2 and 14 contains a pit, but we have safely visited cave 15. This
means that one of caves 2 and 14 contains a pit.

Likewise cave 15 must be next to a pit because we felt a draft there.
Hence, one of caves 0, 4 and 14 contains a pit, but we have safely visited
cave 4. This means that one of caves 0 and 14 contains a pit.

This is multiple evidence of a pit in cave 14 which makes it probable that
cave 14 contains a pit. It is less likely that cave 0 contains a pit. Hence,
Ira, we might want to explore cave 0 instead.

Thus,  the  overlay  model  has  allowed  the  tutor  to  focus  on  explaining  the  double 
evidence  heuristic  to  the  player. 
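
The pruning step can be pictured with a short Python sketch. It assumes that each sentence of the Expert's analysis is tagged with the skill it explains; the tagging, the paraphrased sentences and the name `prune` are assumptions made for illustration and do not describe how the WUSOR-II Tutor is actually coded.

```python
from typing import List, Set, Tuple

# (skill the sentence explains, sentence); the tags are assumed for this example.
Step = Tuple[str, str]

def prune(steps: List[Step], known_skills: Set[str]) -> List[str]:
    """Keep only the sentences whose skill the player is not believed to know."""
    return [sentence for skill, sentence in steps if skill not in known_skills]

expert_analysis: List[Step] = [
    ("L1", "Cave 4 must be next to a pit because we felt a draft there."),
    ("L2", "We have safely visited cave 4, so it cannot contain the pit."),
    ("P2", "This is multiple evidence of a pit in cave 14."),
]

# Skills marked KNOWN = T in Overlay Model 1.
advice = prune(expert_analysis, known_skills={"L1", "L2", "P1"})
print(advice)  # ['This is multiple evidence of a pit in cave 14.']
```

With an empty set of known skills the full analysis is returned, which corresponds to the unpruned explanation quoted above.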


3.  The  P Rules 

No single source of evidence is a certain indicator of an individual's knowledge.
Hence, the Psychologist is provided with four sources of evidence: (1) implicit (the
student's behavior in playing the game), (2) structural (the intrinsic complexity
relations between skills of the Expert), (3) explicit (the dialog between tutor and
player), and (4) background (estimates of how average players of varying backgrounds
can be expected to perform).

In this section, we define the P rules, a set of procedures which modify the
overlay model when triggered by these various kinds of evidence. Section 4 describes
the P Critic whose function is to set the KNOWN variable on the basis of the history of
changes to USED and APPROPRIATE. In these sections, our example is the creation and
maintenance of the K model, an overlay on the Expert. [Goldstein 77] describes the
application of overlay techniques to the creation and maintenance of the L model, an
overlay on the Tutor.

Implicit Evidence: The student's play yields implicit evidence regarding his
mastery of various skills. The Expert evaluates the merits of the player's move
relative to the available alternatives. The assumption is that the player has learned
those skills involved in choosing his particular move and rejecting its inferiors, and
has yet to learn those skills needed to recognize superior moves.

The implicit evidence rules utilize the Expert's analysis as follows:

P-I1: If skill S is involved in an overlooked superior move and not in the
      current move, then increase APPROPRIATE by C(S) and recompute the
      frequency.

P-I2: If skill S is involved in the current move and not in a rejected inferior
      move, then increase USED and APPROPRIATE by C(S) and recompute the
      frequency.

where C(S) is a complexity factor ranging between 0 and 1 that
decreases as the skill becomes more complex relative to the student's
current knowledge state. C(S) is defined below, under Structural Evidence.


For example, in situation 1 the Expert reports to the Psychologist that caves 0
and 2 are better than 14 on the basis of double evidence (P2). If the player chooses
14, the Expert's analysis triggers P-I1 which increments APPROPRIATE but not USED.
(The FREQUENCY of P2 therefore drops.) On the other hand, if the player chose cave 0
or 2, then P-I2 would be triggered and both USED and APPROPRIATE for P2 would increase.


The Expert also reports to the Psychologist that cave 0 is better than cave 2
because of the known bat in the latter. Hence if the player chooses 2, P-I1 is
triggered and the frequency of use of L3 drops, while choosing 0 has the opposite
effect.
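
The two implicit evidence rules amount to a small update on the overlay counts. The sketch below is one hedged reading of them: the dictionary layout, the function names and the use of plain sets of skill names are assumptions made for this example, and C(S) is simply passed in as a table.

```python
from typing import Dict, Set

Entry = Dict[str, float]  # {"used": ..., "appropriate": ...} (assumed layout)

def apply_implicit_evidence(
    model: Dict[str, Entry],
    current_move_skills: Set[str],
    superior_move_skills: Set[str],
    inferior_move_skills: Set[str],
    complexity: Dict[str, float],
) -> None:
    """Update USED and APPROPRIATE in place, following P-I1 and P-I2."""
    # P-I1: skill behind an overlooked superior move, absent from the move made.
    for skill in superior_move_skills - current_move_skills:
        model[skill]["appropriate"] += complexity[skill]
    # P-I2: skill behind the move made and not behind a rejected inferior move.
    for skill in current_move_skills - inferior_move_skills:
        model[skill]["used"] += complexity[skill]
        model[skill]["appropriate"] += complexity[skill]

def frequency(entry: Entry) -> float:
    """USED in proportion to APPROPRIATE."""
    return entry["used"] / entry["appropriate"] if entry["appropriate"] else 0.0

# The player moves to cave 14 and so overlooks the moves that rest on P2.
model = {"P2": {"used": 1.0, "appropriate": 4.0}}
apply_implicit_evidence(model, set(), {"P2"}, set(), {"P2": 1.0})
print(frequency(model["P2"]))  # 0.2

# Had the player moved to cave 0 instead, P-I2 would have fired.
other = {"P2": {"used": 1.0, "appropriate": 4.0}}
apply_implicit_evidence(other, {"P2"}, set(), set(), {"P2": 1.0})
print(frequency(other["P2"]))  # 0.4
```

Starting from the P2 row of Overlay Model 1, the poor move lowers the frequency from 25% to 20%, while the good move raises it to 40%.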

Structural Evidence: Clues to the student's knowledge arise from an analysis of
the intrinsic structure of the skills to be conveyed. This analysis of the Expert's
skills is stored as the Syllabus, a network linking the skills in terms of their
complexity and dependencies. Fig. 5 is a simplified Wumpus syllabus for the five
reasoning skills introduced earlier.


Structural knowledge suggests that given a student familiar with a certain region
of the syllabus (as indicated by the K model), it is more likely that a new skill being
acquired is at the frontier of this region rather than deep into unknown territory.
WUSOR-II implements this heuristic in a conservative fashion: C(S) is set to zero for
every skill more than one away from a known skill. WUSOR-II thus ignores the possible
employment of skills not at the frontier.

We currently believe that this is too conservative. It assumes that skills can
only be learned in the order in which they appear in the syllabus. Such an assumption
is too strong as the syllabus is only a guideline. Double evidence might be employed
despite non-mastery of the elimination strategy. Hence, our current plans call for
redefining C(S) to decrease in proportion to how far a skill is from the student's
current knowledge state.

    C(S) = 1 / D(S)     where D(S) is the distance of S from the
                        farthest known skill.

D(S) is the distance from the farthest, not the nearest, known skill since the use of S
depends on all the skills linked to it. S may be linked to several skills, all but one
of which are known. The unknown skill then becomes the critical piece of knowledge.

For example, consider again a player moving to cave 0 in situation 1. The implicit
evidence rule P-I2 is triggered by the apparent use of double evidence P2. But with
respect to overlay model 1, the earlier skill L3 has not yet been learned. Hence, D(S)
= 2 and therefore the change to USED and APPROPRIATE is reduced by 50%. The
Psychologist is being cautious in interpreting this apparently advanced behavior as
evidence of a non-local improvement in the player's skill. However, this possibility
is not ignored.
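
One way to read the definition of D(S) is sketched below in Python. Both the prerequisite links in `SYLLABUS` and the recursive interpretation of "distance from the farthest known skill" are assumptions made for this example (Fig. 5 is not reproduced here), chosen so that the sketch agrees with the worked case D(P2) = 2.

```python
from typing import Dict, List, Set

# Hypothetical prerequisite links; the real syllabus of Fig. 5 may differ.
SYLLABUS: Dict[str, List[str]] = {
    "L1": [],
    "L2": ["L1"],
    "L3": ["L2"],
    "P1": ["L1"],
    "P2": ["L3", "P1"],
}

def distance(skill: str, known: Set[str]) -> int:
    """D(S): steps from S back to known skills, taking the longest link.

    Each prerequisite contributes 1 if it is known, otherwise 1 plus its own
    distance; the unknown prerequisite is therefore the critical one.
    """
    prerequisites = SYLLABUS[skill]
    if not prerequisites:
        return 1  # a root skill is treated as sitting at the frontier
    return max(1 if p in known else 1 + distance(p, known) for p in prerequisites)

def complexity_factor(skill: str, known: Set[str]) -> float:
    """C(S) = 1 / D(S)."""
    return 1.0 / distance(skill, known)

known = {"L1", "L2", "P1"}             # skills with KNOWN = T in Overlay Model 1
print(distance("P2", known))           # 2
print(complexity_factor("P2", known))  # 0.5
print(complexity_factor("L3", known))  # 1.0
```

Under this reading the earlier, conservative WUSOR-II behavior corresponds to replacing `1.0 / distance(...)` with 0 whenever the distance exceeds one.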


Explicit Evidence: Another source of evidence can be obtained from the player's
response to questions asked by the Tutor. This capability is not currently implemented
in WUSOR-II. We have plans to implement a facility for the Tutor to obtain explicit
evidence by asking the student two types of questions: test cases and follow up
questions.

In a test case question, the tutor will ask the student to order the moves for the
current board state or a test case. Analyzing the response reduces to the implicit
evidence case, except that there is a larger window into the player's reasoning. The
Psychologist need not guess that the student has overlooked superior moves and rejected
inferior moves: the evidence is explicit in the requested ordering. The possibility
that the student has forgotten to consider one alternative (which might happen in a
complex game situation) is precluded. For example, situation 1 might serve as a test
case in conjunction with the following question:

Which of the following statements do you agree with most?

(1) Caves 0, 2 and 14 are equally safe.

(2) Caves 0 and 2 are equally safe, but cave 14 is more dangerous.

(3) Cave 0 is safer than both 2 and 14.

The second kind of explicit evidence will be derived from follow up questions that
ask the student to choose among a set of possible rationales for why the current move
was chosen. Rule P-E1 will monitor this source of evidence.

P-E1: If a player chooses the wrong rationale, then increment APPROPRIATE by 1
      for each skill S involved in the correct rationale but absent in the
      chosen rationale.

For example, a follow up question to a move to cave 0 in situation 1 might be:

Which of the following explanations apply?

(1)  Caves  0,  2 and  14  are  equally  safe. 

(2)  Caves  0 and  2 are  equally  safe,  but  cave  14  is  more  dangerous  because 
there  is  double  evidence  for  pits  in  14  and  only  single  evidence  for  0 
and  2.  Otherwise  0 and  2 are  the  same. 

(3)  Cave  0 is  safer  than  2 because  there  is  a bat  in  2 but  no  bat  in  0. 
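
Rule P-E1 reduces to a set difference between the skills behind the correct rationale and those behind the rationale the player chose. The following Python sketch illustrates this; the function name `apply_p_e1` and the assignment of P2 and L3 to the explanations above are assumptions made for the example, since this facility is only planned.

```python
from typing import Dict, Set

def apply_p_e1(appropriate: Dict[str, float], correct: Set[str], chosen: Set[str]) -> Set[str]:
    """Apply P-E1 and return the skills whose APPROPRIATE count was incremented."""
    missed = correct - chosen  # in the correct rationale, absent from the chosen one
    for skill in missed:
        appropriate[skill] = appropriate.get(skill, 0.0) + 1.0
    return missed

# Assumed tagging: explanation (2) rests on P2, explanation (3) on L3.
appropriate = {"P2": 4.0, "L3": 4.0}
missed = apply_p_e1(appropriate, correct={"P2", "L3"}, chosen={"P2"})
print(sorted(missed))     # ['L3']
print(appropriate["L3"])  # 5.0
```

Because only APPROPRIATE grows, the frequency of each missed skill drops, just as it does under P-I1.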

Background  Evidence:  Every  teacher  has  expectations  about  the  performance  of  a 
student  on  the  basis  of  that  student's  background.  This  estimate  changes  as  experience 
with  the  student  is  acquired,  but  it  provides  a useful  starting  point. 

In the first implementation of the WUSOR coach, the Psychologist asked the player
to classify his level of skill as either "novice", "amateur", "advanced" or
"expert". Each of these skill levels corresponded to a different initialisation for
the overlay model.


We are currently experimenting with a set of background rules that associate
different starting states for the overlay model with different replies to a
questionnaire presented to the player at the beginning of his first game. These rules
are triggered by the player's age and experience with the game. For example, the three
rules for a secondary school player are:

P-C1: If the player is in secondary school with no previous experience, then
      initialize the K model to AMATEUR, i.e. familiarity with the skills L1
      (positive evidence) and P1 (equal likelihood).

P-C2: If the player is in secondary school and has had 1-10 games experience
      without coaching, then initialize the K model to ADVANCED, i.e. assume
      familiarity with L1, L2, L3 and P1.

P-C3: If the player is in secondary school and has had over 10 games
      experience without coaching, then initialize the K model to EXPERT, i.e.
      assume familiarity with all of L1, L2, L3, P1 and P2.

Similar rules are used for pre- and post-secondary school players. The rules
associate naturally bounded portions of the syllabus (as determined by dependency and
complexity criteria) to various age and skill backgrounds. We do not yet have enough
experience with these background rules to know whether the categories of experience we
have chosen are reasonable. We plan to acquire this experience by studying whether the
implicit and explicit rules find a particular background skill estimate, on the
average, too high or too low for players of a given background.
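
The three secondary school rules can be written as a small lookup, sketched here in Python. The names `INITIAL_SKILLS` and `secondary_school_level` are hypothetical, and the sketch covers only the secondary school case with uncoached games, as stated in the rules above.

```python
from typing import Dict, Set

# Skills assumed familiar at each starting level, as given in the three rules.
INITIAL_SKILLS: Dict[str, Set[str]] = {
    "AMATEUR": {"L1", "P1"},
    "ADVANCED": {"L1", "L2", "L3", "P1"},
    "EXPERT": {"L1", "L2", "L3", "P1", "P2"},
}

def secondary_school_level(uncoached_games: int) -> str:
    """Pick the starting level for a secondary school player."""
    if uncoached_games == 0:
        return "AMATEUR"   # no previous experience
    if uncoached_games <= 10:
        return "ADVANCED"  # 1-10 games without coaching
    return "EXPERT"        # over 10 games without coaching

level = secondary_school_level(7)
print(level, sorted(INITIAL_SKILLS[level]))  # ADVANCED ['L1', 'L2', 'L3', 'P1']
```

Each level is a superset of the previous one, which reflects the remark that the rules pick out naturally bounded portions of the syllabus.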

4. The P Critic

The Psychologist maintains a history of changes to the USED and APPROPRIATE
variables in order to detect inconsistencies and discontinuities. Inconsistencies are
evidence that the P rules are failing to model the student properly, while
discontinuities are indications of a change in the player's knowledge state. The P
Critic makes these decisions.


Fig. 6 is a history graph for skill S. The graph is ideal in the sense that the
player consistently fails to use skill S in situations judged appropriate by the
Expert, until point X, after which he consistently employs the skill. There is
no occasional use of the skill. The P Critic would set KNOWN to T shortly after point
X.

Real situations are not this clear cut; hence a certain tolerance is allowed as
shown in fig. 7. A slope of zero to 10 degrees results in KNOWN being set to NIL. A
slope of 35 to 45 degrees is sufficient for the critic to set KNOWN to T.
Between these two regions, KNOWN is set to ?.

"?" reflects uncertainty on the part of the Psychologist. The student may be in
the process of acquiring the skill, and not yet able to use it consistently. Or the P
rules may be failing to model the student properly.
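
The slope test can be sketched as follows. The sketch assumes that the history graph plots USED against APPROPRIATE, so that consistent use gives a 45 degree line, and that the slope is taken between the first and last points of a recent window; both are assumptions made for this example, as are the function names.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (APPROPRIATE, USED) after some move

def slope_degrees(history: List[Point]) -> float:
    """Angle of the line from the first to the last point of the history."""
    (a0, u0), (a1, u1) = history[0], history[-1]
    return math.degrees(math.atan2(u1 - u0, a1 - a0))

def critic(history: List[Point]) -> str:
    """Set KNOWN from the slope, using the tolerances quoted above."""
    angle = slope_degrees(history)
    if angle <= 10:
        return "NIL"  # the skill is almost never used when appropriate
    if angle >= 35:
        return "T"    # the skill is used nearly every time it is appropriate
    return "?"        # in between: the Psychologist is uncertain

print(critic([(0, 0), (4, 4)]))  # T
print(critic([(0, 0), (4, 0)]))  # NIL
print(critic([(0, 0), (4, 2)]))  # ?
```

A discontinuity such as point X would show up as a change in the verdict when the window slides past it.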

When KNOWN = ?, the Tutor module of the Coach becomes cautious about assuming that
the student knows the skill even in situations where the student chooses the proper
move. Explicit evidence is sought by means of follow up questions inquiring about the
student's rationale. In the event that no clarification is obtained, i.e. the student
is inconsistent even on these questions, the Tutor will ultimately ignore the
Psychologist on this skill. The result is that the Coach is reduced to providing
explanations generated by the Expert (when the student makes a non-optimal move) that
are unpruned with respect to this skill.

5. Limitations

The modelling being conducted by the Psychologist rests on the assumption that the
skills employed by the student are a subset of those of the Expert. This is not
inevitable for at least three reasons. First, the student may be solving problems in a
fashion completely divergent from the Expert; there can be multiple paradigms for the
particular problem domain. Second, the student may be using a non-optimal method for
his own reasons. A Wumpus player may be more concerned with finishing quickly than
avoiding risk, and hence choose a move to a more informative cave, despite greater
risk. Third, the student may possess a skill of the Expert in an incorrect form,
perhaps using it inappropriately.

We have sought to make the Expert a useful foundation for modelling by
imposing certain criteria on its design. The major one is the use of a rule
system to represent the heuristics commonly employed by skilled players. This approach
has been profitably employed in the medical domain [Shortliffe 74] and we similarly
find it a useful framework in which to modularly represent human skill. By
interviewing skilled players and by introspection, we have evolved a rule system whose
reasoning is acceptable to skilled players as capturing the essential ingredients of
their own analyses. In this fashion, we have constructed an Expert for the game of
Wumpus that provides a reasonable basis for modelling.

For the restricted decision making environment of Wumpus, we have not encountered
multiple problem solving paradigms nor have we found it common for a student to ignore
the basic strategy of choosing the safest unvisited cave. However, for other domains
such as mathematical problem solving, the possibility of multiple models of expertise
exists. It is a fundamental limitation of overlay modelling that a player cannot be
modelled who employs a logic not understood by the Expert. Indeed, a human teacher
cannot understand a student reasoning in a legitimate fashion unknown to the teacher.
The power of a successful teacher arises from knowing multiple means for solving a
given problem, and hence being sensitive to the particular choice made by the student.
The same possibility is available to the Coach, if a Meta-Expert is provided. A Meta-
Expert is a set of Experts for the given task, each modular, articulate, and
comprehensible, and each capable of supplying a move analysis from its own
perspective.

With a Meta-Expert, the Psychologist can attempt to identify which Expert the
student most closely approximates. The evidence distinguishing the Experts derives
from those situations where the predictions of the Experts differ. However, the cost
of multiple Experts is one more source of uncertainty. We have avoided this difficulty
to date by choosing a tutoring situation, Wumpus, where there is broad agreement
on the part of expert players as to the necessary skills. The design of Coaches with
a Meta-Expert module is a future research goal.
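
Since the Meta-Expert is a future goal, the following Python sketch is purely speculative. It shows one simple way the Psychologist might pick the closest Expert: count agreements with the student's moves, but only in situations where the Experts' predictions differ. The two toy Experts, the `Situation` layout and all names are invented for the example.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Tuple

Situation = Dict[str, Dict[int, float]]  # hypothetical: per-cave "risk" and "info"
Expert = Callable[[Situation], int]      # an Expert proposes a cave to move to

def safest(situation: Situation) -> int:
    """Toy Expert that minimizes risk."""
    return min(situation["risk"], key=situation["risk"].get)

def most_informative(situation: Situation) -> int:
    """Toy Expert that prefers the cave yielding the most information."""
    return max(situation["info"], key=situation["info"].get)

def closest_expert(experts: Dict[str, Expert],
                   observations: List[Tuple[Situation, int]]) -> Optional[str]:
    """Return the Expert that most often matches the student where Experts disagree."""
    agreement: Counter = Counter()
    for situation, move in observations:
        predictions = {name: expert(situation) for name, expert in experts.items()}
        if len(set(predictions.values())) < 2:
            continue  # the Experts agree, so this move cannot tell them apart
        for name, predicted in predictions.items():
            if predicted == move:
                agreement[name] += 1
    return agreement.most_common(1)[0][0] if agreement else None

experts = {"cautious": safest, "hasty": most_informative}
observations = [
    ({"risk": {0: 0.2, 14: 0.6}, "info": {0: 1.0, 14: 3.0}}, 14),
    ({"risk": {2: 0.1, 7: 0.5}, "info": {2: 2.0, 7: 1.0}}, 2),  # Experts agree
    ({"risk": {3: 0.3, 9: 0.4}, "info": {3: 1.0, 9: 2.0}}, 9),
]
print(closest_expert(experts, observations))  # hasty
```

The second observation is skipped because both toy Experts recommend cave 2, which illustrates why only the disagreements carry evidence.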

Meta-Experts, however, do not address the modelling difficulties arising when the
student employs a skill in an incorrect form. For example, we have found some students
to employ the positive and negative evidence skills for bats and pits but not for the
Wumpus. The reason presumably is the greater simplicity resulting from the fact that
bat and pit warnings propagate only one cave, while the Wumpus warning propagates two
caves. We have addressed this problem by not organizing the Expert around the most
general set of skills. Rather, positive and negative evidence has been represented by
"micro" skills, one set for the one-cave warnings of bats and pits and one set for
the two-cave warnings of the Wumpus. Our philosophy has been to break the skill
analysis into sufficiently simple rules that a model which only records their presence
or absence is sufficient.

It would be better to have a general theory of learning that suggested typical
bugs that might occur in learning a given skill. In other research, the Coach project
has studied the theory of bugs in relation to different kinds of plans [Miller &
Goldstein 1976]. But in Wumpus the overall plan is simple: find the relative dangers
of each cave. Hence, we find that we are able to model the student without an elaborate
bug analysis. Future research will seek to couple a theory of debugging to the theory
of overlay modelling.


Given  this  analysis  of  the  fundamental  assumptions  of  overlay  modelling  and  its 
limitations,  there  are  clearly  four  situations  where  such  modelling  will  fail.  These 
are  situations  in  which  the  underlying  assumptions  of  these  modelling  rules  are 
violated. 

1. Extreme Inconsistency on the part of the player: the P critic will
   ultimately set the KNOWN variable of all skills to "?".

2. Unrecognized Expertise employed by the player: again the P critic will
   ultimately turn off the Psychologist, unless a Meta-Expert is available.

3. Explanations in Complex Verbal Form: natural language comprehension
   in the Coach is not yet implemented. Explanations expressed in English by
   the player are not allowed.

4. Distinguishing first order from second order bugs, that is, distinguishing
   the complete absence of a skill from its inappropriate use. Test questions
   help in this situation, but are not always sufficient.

However, these situations would also tax the abilities of a human teacher.

Despite these limitations, overlay modelling remains useful for two reasons. The
first is that overlay modelling in its relation to explanation is essentially a
linguistic theory of the Speaker. Each of us, when formulating an explanation,
abbreviates the explanation in accord with our model of the listener. This model is
based on our analysis of the listener's behavior in terms of the knowledge we believe
is relevant. Overlay modelling performs a similar function for the Coach. A human
speaker or computer coach may have a mistaken model of the listener, but ultimately a
person or computer can judge another only in terms of what he, she or it knows itself.

The  second  reason  arises  from  the  special  demands  of  the  educational  context.  The 
Coach  is  not  an  impartial  observer,  but  rather  has  the  goal  of  conveying  its  style  of 
expertise.  Hence,  its  insight  into  the  student  can  be  useful,  even  if  limited  to 
hypotheses  regarding  which  aspects  of  its  expertise  the  student  possesses.  Its  goal  is 
to  convey  that  style  it  knows  about;  its  modelling  is  to  determine  how  much  of  that 
style  is  known. 




6.  Experimental  Program 

The fundamental question is how accurate the H and L models are as estimates of
the player's knowledge and learning preferences. To address this question, we are
employing four different classes of experiment.

1. Turing Tests: Human players will be analyzed by interviewers to provide benchmarks
for the level of modelling that can be achieved by competent human teachers. In
one variation, an accomplice will be asked to deliberately simulate certain student
strategies, and the ability of human observers to detect these strategies will be
studied. These Turing Tests will determine if the Psychologist module provides
modelling performance comparable to human observers.

2. Articulate Psychologist Experiments: Our rule-based approach to modelling allows
the Psychologist to explain its hypotheses by reporting which rules were triggered
and by what evidence. The accuracy of these self-explanations will be judged by
both the student himself, and an interviewer who observes the student's play and
discusses his moves with him.

3. Closed Loop Experiments: The game will be played by a modified version of the
Expert program which employs a sub-optimal strategy. The Psychologist will be
judged by whether it diagnoses the strategy.

4. Predictive Experiments: An overlay model can yield a deterministic procedural model
of a player by deleting all rules of the Expert with KNOWN = NIL. The result is a
simulated player that can be used to predict the player's performance. The
accuracy of these predictions will provide another test of the Psychologist's
success.
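
The construction of a simulated player for the Predictive Experiments can be sketched as follows. This is a minimal illustration, assuming that each Expert rule has a name and an action, that rules are tried in a fixed order of preference, and that the overlay maps rule names to KNOWN values. The Rule type, the rule names, the situation fields, and the functions simulated_player and predict_move are all hypothetical and are not WUSOR-II code.

```python
from typing import Callable, NamedTuple, Optional


class Rule(NamedTuple):
    name: str
    # Returns a proposed move, or None if the rule does not apply.
    action: Callable[[dict], Optional[str]]


# Hypothetical Expert rules, listed from most to least preferred.
EXPERT = [
    Rule("avoid-certain-danger", lambda s: s.get("safe_cave")),
    Rule("weigh-probable-danger", lambda s: s.get("least_risky_cave")),
    Rule("move-anywhere", lambda s: s["neighbors"][0]),
]


def simulated_player(expert, overlay):
    """Delete every rule the overlay does not hypothesize as known."""
    return [rule for rule in expert if overlay.get(rule.name)]


def predict_move(rules, situation):
    """Return the move proposed by the first applicable rule, or None."""
    for rule in rules:
        move = rule.action(situation)
        if move is not None:
            return move
    return None


# Assumed overlay: False plays the role of KNOWN = NIL.
overlay = {
    "avoid-certain-danger": True,
    "weigh-probable-danger": False,
    "move-anywhere": True,
}
situation = {"neighbors": ["cave-3", "cave-7"], "least_risky_cave": "cave-7"}

student = simulated_player(EXPERT, overlay)
print(predict_move(EXPERT, situation))   # move chosen by the full Expert
print(predict_move(student, situation))  # move predicted for the student
```

Comparing the predicted move with the move the player actually makes gives one data point for the accuracy test described in the fourth class of experiment.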

To date, we have carried out informal "articulate psychologist" experiments with
WUSOR-II. Players over a wide spectrum of skill find the comments generated by the
psychologist  module  to  be  comprehensible  and  reasonable,  as  evaluated  by  interviews 
with  the  players.  We  have  also  run  closed  loop  experiments  in  which  an  impartial 
player  consistently  employs  a sub-optimal  strategy.  WUSOR-II  successfully  diagnoses 
this.  We  are  currently  in  the  process  of  designing  simulated  players  to  serve  as 
rigorous  closed  loop  tests. 

We plan over the next 12 months to run the two most ambitious classes of
experiments,  Turing  Tests  and  Predictive  Experiments.  Our  subject  population  will  be 
undergraduates  enrolled  in  an  education  major.  (We  will  be  interested  both  in  the 
success  WUSOR  has  in  coaching  these  students  and  in  their  reactions  to  WUSOR  as  an 
educational  tool.) 

In  summary,  we  are  encouraged  by  reactions  of  students  and  teachers  to  the  current 
state  of  the  Coach,  but  rigorous  evaluation  of  overlay  modelling  remains  to  be  done. 


7.  Related  Literature 

WEST: The WEST program by Burton and Brown [1976] is a computer coach for the
PLATO game "HOW THE WEST WAS WON". In this game, a player must form from three numbers
an arithmetic expression whose value is either the largest possible, or occasionally a
given number. The educational purpose of the game is to provide experience with
arithmetic operators and the use of parentheses. C. Resnick [1975] found that many
students reached plateaus, such that they failed to improve their skill, although they
continued to enjoy the game. The WEST coach was designed to discuss less than optimal
moves with the student in order to move him or her off such a plateau.

Burton and Brown model the student by contrasting his or her move to the move
recommended by the expert. USED and APPROPRIATE variables are maintained to record the
frequency with which different skills are employed. Burton and Brown's development of
this modelling technique was our starting point. We have extended their approach in
three ways.

(1) A syllabus is introduced that organizes the skills in a complexity/dependency
graph. For complex situations this is required. For simpler domains with a limited
number of skills, it is less important. For WEST, the syllabus of fig. 8 might have
been employed, which reflects the usual order in which arithmetic skills are taught.


(2) A P critic is introduced to observe discontinuities and inconsistencies. This
is important for detecting when the modelling is failing to capture the student's
behavior. It could readily be applied to the WEST case.

(3) Multiple sources of evidence are used to increase the window into the
student's reasoning. WEST relied solely on implicit evidence derived from the
student's play. A facility for obtaining explicit evidence through follow-up questions
could be incorporated into the WEST coach.

BIP: BIP [Wescourt 1976] is a CAI program for tutoring elementary programming
skills. We mention it here to cite an alternative to Expert-based overlay modelling.
BIP uses a very detailed syllabus, as does overlay modelling. But BIP associates with
each skill in the syllabus (called the Curriculum Information Network) a set of
specific exercises and a description of the various correct and incorrect solutions. A
skill is attributed to the student if he or she succeeds at these exercises.

The  virtue  of  this  approach  is  that  the  diagnosis  of  whether  a skill  is  employed 
is  much  simpler  to  make.  An  elaborate  domain  expert  is  not  needed.  The  disadvantage, 
however,  is  that  the  tasks  are  very  restrictive,  e.g.  a typical  one  might  be  to  "PRINT 
A LITERAL". The greater complexity of overlay modelling with respect to an embedded
Expert  program  is  required  to  allow  free  choice  by  the  student  in  more  complex  problem 
settings. 


Human Problem Solving: Overlay modelling is a potentially valuable tool for
information processing psychology. Hence we compare it here to the work done by Newell
and Simon [1972] and their colleagues. Overlay modelling can be used to induce a
production system model of a human problem solver. The required ingredient is an Expert
that analyzes the problem solver's acts in terms of a set of constituent skills -- in
this case a set of productions. This notion of comparing a problem solving protocol to
the behavior of a production system is briefly described as the "trace" feature of the
PAS-II protocol analysis program [Waterman and Newell 1973].

In  the  computer  coach  context,  we  have  not  attempted  the  level  of  detail  in 
modelling  that  Newell  and  Simon  seek,  wherein  even  eye  movements  must  be  accounted  for. 
The  Coach  does  not  have  that  much  information  regarding  the  student's  behavior. 
Indeed, we do not allow unrestricted English interaction. In the PAS-II protocol
analysis program, English is permitted by making the program interactive -- i.e. a
human analyst can aid in interpreting the protocol. Such a solution is not applicable
to  the  real  time  demands  made  by  the  computer  coaching  context. 

Our development of overlay modelling suggests an extension to production-based
modelling, in the form of the Syllabus. The productions for a given problem domain can
be organized into a network reflecting complexity and dependency. This network then
suggests the order in which the productions are acquired.
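
The idea that a dependency network suggests an order of acquisition can be sketched with a topological sort. The example below is a minimal illustration under stated assumptions: the production names are an invented arithmetic example in the spirit of the WEST discussion, not the syllabus from the paper's figure, and the functions acquisition_order and ready_to_learn are hypothetical helpers.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each production maps to the productions it depends on (assumed example).
SYLLABUS = {
    "addition": set(),
    "subtraction": {"addition"},
    "multiplication": {"addition"},
    "division": {"multiplication", "subtraction"},
    "parentheses": {"addition", "multiplication"},
}


def acquisition_order(syllabus):
    """Return one ordering in which each production follows its prerequisites."""
    return list(TopologicalSorter(syllabus).static_order())


def ready_to_learn(syllabus, known):
    """Productions not yet known whose prerequisites are all known."""
    return sorted(
        name for name, deps in syllabus.items()
        if name not in known and deps <= known
    )


order = acquisition_order(SYLLABUS)
print(order)
print(ready_to_learn(SYLLABUS, {"addition"}))
```

Several orderings can satisfy the same network, so the network constrains the order of acquisition without fixing it. The second function shows how such a network could also indicate which productions a learner is in a position to acquire next.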


8.  Conclusions 

Overlay  modelling  constitutes  a set  of  techniques  for  describing  a person's 
problem  solving  skills  in  terms  of  an  expert  program  for  the  task.  These  techniques 
are  rule  systems  for  monitoring  multiple  sources  of  evidence,  overlays  for  structuring 
the  model,  and  a critic  for  detecting  non-linearities.  This  approach  has  limitations, 
but  it  has  already  shown  itself  to  be  useful  for  maintaining  a model  of  the  learner's 
state as part of an AI-based CAI program.

Ultimately,  progress  towards  an  improved  theory  of  modelling  will  have  an 
important impact on the following areas:


1.  In  CAI  by  addressing  the  critical  need  to  model  the  learner  so  as  to 
provide  high  quality  personalized  instruction. 

2.  In  education  by  offering  overlays  as  a structural,  non-numerical  model  of 
the  student. 

3. In applied AI by improving the ability of an AI program employed as an
intelligent assistant to generate appropriate explanations for the user.

4. In theoretical AI by defining criteria such as comprehensibility and
modularity that expert programs should satisfy if they are to be useful as
part of AI-based CAI systems.

5. In information processing psychology by developing a procedural theory for
inducing models of a subject's problem solving behavior.